Learning context-aware adaptive solvers to accelerate quadratic programming
Convex quadratic programming (QP) is an important sub-field of mathematical
optimization. The alternating direction method of multipliers (ADMM) is a
successful method to solve QP. Even though ADMM shows promising results in
solving various types of QP, its convergence speed is known to be highly
dependent on the step-size parameter ρ. Due to the absence of a general
rule for setting ρ, it is often tuned manually or heuristically. In this
paper, we propose CA-ADMM (Context-aware Adaptive ADMM), which learns to
adaptively adjust ρ to accelerate ADMM. CA-ADMM extracts the
spatio-temporal context, which captures the dependency of the primal and dual
variables of QP and their temporal evolution during the ADMM iterations.
CA-ADMM chooses ρ based on the extracted context. Through extensive
numerical experiments, we validated that CA-ADMM effectively generalizes to
unseen QP problems with different sizes and classes (i.e., having different QP
parameter structures). Furthermore, we verified that CA-ADMM could dynamically
adjust ρ considering the stage of the optimization process to accelerate
the convergence speed further.
Comment: 9 pages, 4 figures
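To make the role of the step size concrete, here is a minimal sketch of plain ADMM with a fixed ρ on a small box-constrained QP. It is not CA-ADMM and involves no learning; the function name `admm_box_qp`, the box-constraint form, and the stopping rule are assumptions chosen for illustration. It only shows that the same problem can need very different iteration counts depending on ρ.

```python
import numpy as np

def admm_box_qp(P, q, lo, hi, rho, tol=1e-8, max_iter=10_000):
    """Fixed-step ADMM (scaled dual form) for
    minimize 0.5*x'Px + q'x  subject to  lo <= x <= hi.
    Hypothetical helper for illustration; returns (solution, iterations)."""
    n = q.size
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    A = P + rho * np.eye(n)  # system matrix of the x-update
    for k in range(1, max_iter + 1):
        x = np.linalg.solve(A, rho * (z - u) - q)  # quadratic subproblem
        z_old = z
        z = np.clip(x + u, lo, hi)                 # projection onto the box
        u = u + x - z                              # scaled dual update
        primal_res = np.linalg.norm(x - z)
        dual_res = rho * np.linalg.norm(z - z_old)
        if primal_res < tol and dual_res < tol:
            return z, k
    return z, max_iter

# Badly scaled toy problem: the iteration count depends strongly on rho.
P = np.diag([1.0, 100.0])
q = np.array([-1.0, -100.0])
for rho in (1.0, 100.0):
    sol, iters = admm_box_qp(P, q, 0.0, 0.5, rho)
    print(f"rho={rho:g}: solution={sol}, iterations={iters}")
```

An adaptive scheme would replace the fixed `rho` argument with a value chosen per iteration; how that choice is made is exactly what the paper learns and is not reproduced here.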
Learning scalable and transferable multi-robot/machine sequential assignment planning via graph embedding
Can the success of reinforcement learning methods for simple combinatorial
optimization problems be extended to multi-robot sequential assignment
planning? In addition to the challenge of achieving near-optimal performance in
large problems, transferability to an unseen number of robots and tasks is
another key challenge for real-world applications. In this paper, we suggest a
method that achieves the first success in both challenges for robot/machine
scheduling problems.
Our method comprises three components. First, we show that a robot scheduling
problem can be expressed as a random probabilistic graphical model (PGM). We
develop a mean-field inference method for random PGM and use it for Q-function
inference. Second, we show that transferability can be achieved by carefully
designing two-step sequential encoding of problem state. Third, we resolve the
computational scalability issue of fitted Q-iteration by suggesting a heuristic
auction-based Q-iteration fitting method enabled by the transferability we
achieve.
We apply our method to discrete-time, discrete-space problems (Multi-Robot
Reward Collection (MRRC)) and scalably achieve 97% optimality with
transferability. This optimality is maintained under stochastic contexts. By
extending our method to a continuous-time, continuous-space formulation, we claim
to be the first learning-based method with scalable performance among
multi-machine scheduling problems; our method scalably achieves performance
comparable to popular metaheuristics in identical parallel machine scheduling
(IPMS) problems.
WATTNet: Learning to Trade FX via Hierarchical Spatio-Temporal Representation of Highly Multivariate Time Series
Finance is a particularly challenging application area for deep learning
models due to low signal-to-noise ratio, non-stationarity, and partial
observability. The non-deliverable forward (NDF), a derivatives contract used in
foreign exchange (FX) trading, presents additional difficulty in the form of
long-term planning required for an effective selection of start and end date of
the contract. In this work, we focus on tackling the problem of NDF tenor
selection by leveraging high-dimensional sequential data consisting of spot
rates, technical indicators and expert tenor patterns. To this end, we
construct a dataset from the Depository Trust & Clearing Corporation (DTCC) NDF
data that includes a comprehensive list of NDF volumes and daily spot rates for
64 FX pairs. We introduce WaveATTentionNet (WATTNet), a novel temporal
convolution (TCN) model for spatio-temporal modeling of highly multivariate
time series, and validate it across NDF markets with varying degrees of
dissimilarity between the training and test periods in terms of volatility and
general market regimes. The proposed method achieves a significant positive
return on investment (ROI) in all NDF markets under analysis, outperforming
recurrent and classical baselines by a wide margin. Finally, we propose two
orthogonal interpretability approaches to verify noise stability and detect the
driving factors of the learned tenor selection strategy.
Comment: Submitted to the Thirty-Fourth AAAI Conference on Artificial
Intelligence (AAAI 20
Genetic Algorithms with Neural Cost Predictor for Solving Hierarchical Vehicle Routing Problems
When vehicle routing decisions are intertwined with higher-level decisions,
the resulting optimization problems pose significant challenges for
computation. Examples are the multi-depot vehicle routing problem (MDVRP),
where customers are assigned to depots before delivery, and the capacitated
location routing problem (CLRP), where the locations of depots should be
determined first. A simple and straightforward approach for such hierarchical
problems would be to separate the higher-level decisions from the complicated
vehicle routing decisions. For each higher-level decision candidate, we may
evaluate the underlying vehicle routing problems to assess the candidate. As
this approach requires solving vehicle routing problems multiple times, it has
been regarded as impractical in most cases. We propose a novel
deep-learning-based approach called Genetic Algorithm with Neural Cost
Predictor (GANCP) to tackle the challenge and simplify algorithm developments.
For each higher-level decision candidate, we predict the objective function
values of the underlying vehicle routing problems using a pre-trained graph
neural network without actually solving the routing problems. In particular,
our proposed neural network learns the objective values of the HGS-CVRP
open-source package that solves capacitated vehicle routing problems. Our
numerical experiments show that this simplified approach is effective and
efficient in generating high-quality solutions for both MDVRP and CLRP and has
the potential to expedite algorithm developments for complicated hierarchical
problems. We provide computational results evaluated on the standard benchmark
instances used in the literature.
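The idea of scoring each higher-level candidate with a predictor instead of solving the routing problem can be sketched as follows. This is only an illustration under loud assumptions: `proxy_cost` is a hypothetical stand-in (round-trip distances) for the pre-trained graph neural network, the genetic operators are generic choices, and none of the names come from the paper.

```python
import math
import random

def proxy_cost(assign, customers, depots):
    """Hypothetical stand-in for a learned cost predictor:
    sum of depot -> customer -> depot round trips."""
    return sum(2 * math.dist(customers[i], depots[d]) for i, d in enumerate(assign))

def ga_assign(customers, depots, predict, pop_size=30, gens=50, seed=0):
    """Genetic search over customer-to-depot assignments (needs >= 2 customers).
    Fitness comes from `predict`; no routing problem is solved."""
    rng = random.Random(seed)
    n, m = len(customers), len(depots)
    pop = [[rng.randrange(m) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda a: predict(a, customers, depots))
        elite = pop[: pop_size // 2]                    # keep the better half
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            cut = rng.randrange(1, n)                   # one-point crossover
            child = p1[:cut] + p2[cut:]
            child[rng.randrange(n)] = rng.randrange(m)  # point mutation
            children.append(child)
        pop = elite + children
    return min(pop, key=lambda a: predict(a, customers, depots))

customers = [(0, 1), (1, 0), (9, 9), (10, 8), (5, 5)]
depots = [(0, 0), (10, 10)]
best = ga_assign(customers, depots, proxy_cost)
print(best, proxy_cost(best, customers, depots))
```

Swapping `proxy_cost` for a trained model changes only the `predict` argument, which is the design point: the search loop never needs to know how the cost estimate is produced.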
A Neural Separation Algorithm for the Rounded Capacity Inequalities
The cutting plane method is a key technique for successful branch-and-cut and
branch-price-and-cut algorithms that find the exact optimal solutions for
various vehicle routing problems (VRPs). Among various cuts, the rounded
capacity inequalities (RCIs) are the most fundamental. To generate RCIs, we
need to solve the separation problem, whose exact solution takes a long time to
obtain; therefore, heuristic methods are widely used. We design a
learning-based separation heuristic algorithm with graph coarsening that learns
the solutions of the exact separation problem with a graph neural network
(GNN), which is trained with small instances of 50 to 100 customers. We embed
our separation algorithm within the cutting plane method to find a lower bound
for the capacitated VRP (CVRP) with up to 1,000 customers. We compare the
performance of our approach with CVRPSEP, a popular separation software package
for various cuts used in solving VRPs. Our computational results show that our
approach finds better lower bounds than CVRPSEP for large-scale problems with
400 or more customers, while CVRPSEP shows strong competency for problems with
fewer than 400 customers.
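For reference, in the standard two-index CVRP formulation a rounded capacity inequality for a customer set S reads x(δ(S)) ≥ 2⌈d(S)/Q⌉, where x(δ(S)) is the total edge value crossing the boundary of S, d(S) is the total demand in S, and Q is the vehicle capacity. The sketch below only checks a given set S against a given solution; it does not solve the separation problem (finding a violated S), and the function name and data layout are assumptions for illustration.

```python
import math

def rci_violation(x, demand, capacity, S):
    """Return rhs - lhs of the rounded capacity inequality for customer set S.
    x maps undirected edges (i, j) to their values; a positive result
    means the inequality is violated by x."""
    S = set(S)
    # Total value of edges with exactly one endpoint in S.
    cut = sum(v for (i, j), v in x.items() if (i in S) != (j in S))
    rhs = 2 * math.ceil(sum(demand[i] for i in S) / capacity)
    return rhs - cut

# Depot 0 and three customers served by a single route 0-1-2-3-0.
x = {(0, 1): 1, (1, 2): 1, (2, 3): 1, (0, 3): 1}
demand = {1: 4, 2: 4, 3: 4}
print(rci_violation(x, demand, 10, {1, 2, 3}))  # total demand 12 exceeds capacity 10
print(rci_violation(x, demand, 10, {1}))
```

A separation heuristic, learned or otherwise, searches for sets S where this quantity is positive; the check itself is cheap once S is known.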
Multi-Agent Actor-Critic with Hierarchical Graph Attention Network
Most previous studies on multi-agent reinforcement learning focus on deriving
decentralized and cooperative policies to maximize a common reward and rarely
consider the transferability of trained policies to new tasks. This prevents
such policies from being applied to more complex multi-agent tasks. To resolve
these limitations, we propose a model that conducts both representation
learning for multiple agents using a hierarchical graph attention network and
policy learning using a multi-agent actor-critic. The hierarchical graph
attention network is specially designed to model the hierarchical relationships
among multiple agents that either cooperate or compete with each other to
derive more advanced strategic policies. Two attention networks, the
inter-agent and inter-group attention layers, are used to effectively model
individual and group level interactions, respectively. The two attention
networks have been proven to facilitate the transfer of learned policies to new
tasks with different agent compositions and allow one to interpret the learned
strategies. Empirically, we demonstrate that the proposed model outperforms
existing methods in several mixed cooperative and competitive tasks.
Comment: Accepted as a conference paper at the Thirty-Fourth AAAI Conference
on Artificial Intelligence (AAAI-20), New York, US